Apartment Data


  1. Fertility: A rather large and interesting fertility-related dataset, available at https://vincentarelbundock.github.io/Rdatasets/csv/AER/Fertility.csv

Glimpse / skim / inspect the dataset in each case, state the Data Dictionary, and develop a set of Questions that can be answered by appropriate statistical measures or by a chart that shows the distribution.
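A first look at the Fertility data might be sketched as follows. This is only a starting point: it assumes the tidyverse, skimr, and mosaic packages are installed and that the URL above is reachable, and the object names `fertility` and `qual_vars` are our own choices. The output of the three summaries is the raw material for the Data Dictionary.

```r
library(tidyverse)

# Read the CSV straight from the Rdatasets site (needs an internet connection)
fertility <- read_csv(
  "https://vincentarelbundock.github.io/Rdatasets/csv/AER/Fertility.csv"
)

# Three complementary overviews of the same data frame
glimpse(fertility)          # column names, types, and the first few values
skimr::skim(fertility)      # per-column summaries, grouped by column type
mosaic::inspect(fertility)  # separate summaries for categorical and quantitative columns

# Columns read in as text are the likely Qualitative variables,
# i.e. the candidates for counts and bar charts
qual_vars <- fertility %>%
  select(where(is.character)) %>%
  names()
qual_vars
```

Whether a numerically coded column should also be treated as Qualitative is a judgment call that belongs in the Data Dictionary.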

13 Wait, But Why?

Figure 2: Zipf’s Law

In Figure 2, the letters of the alphabet are “levels” within a Qualitative variable, and these levels have been sorted by their frequency, or count! This is what Sherlock Holmes might have done, and it is the method used to crack the code to the treasure in this story.
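The idea of sorting levels by count can be sketched in a few lines of R. The sample sentence below is our own illustrative choice, not the text behind Figure 2, and the object names are hypothetical; any longer passage could be substituted to see a more Zipf-like shape.

```r
library(tidyverse)
library(ggformula)

# Illustrative sample text (an assumption, not the data behind Figure 2)
sample_text <- "Elementary, my dear Watson"

# Split into single characters, keep only a-z, and count each letter.
# sort = TRUE puts the most frequent letter first.
letter_counts <- tibble(
  letter = strsplit(tolower(sample_text), "")[[1]]
) %>%
  filter(letter %in% letters) %>%
  count(letter, sort = TRUE)

letter_counts

# Bar chart with the levels ordered by descending count
letter_counts %>%
  mutate(letter = reorder(letter, -n)) %>%
  gf_col(n ~ letter) %>%
  gf_labs(x = "Letter", y = "Count", title = "Letter counts, sorted by frequency")
```

Here `gf_col()` is used rather than `gf_bar()` because the counts have already been computed; `reorder()` is what turns the alphabetical default into a frequency ordering.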

14 Conclusion

15 AI Generated Summary and Podcast

This text excerpt focuses on bar charts and histograms as visualization tools for qualitative and quantitative data, respectively. It walks the reader through the creation of bar charts using the R programming language, illustrating the concept through a case study using the Chicago taxi rides dataset. The author explores various scenarios and questions related to taxi tipping, such as the frequency of tips and their dependence on trip locality, company, hour of the day, and day of the week. Finally, the excerpt highlights the importance of understanding data counts before undertaking data modeling or inference, emphasizing the role of bar charts in revealing data distribution and potential imbalances.

16 References

  1. Daniel Kaplan and Randall Pruim. ggformula: Formula Interface for ggplot2 (full version). https://www.mosaic-web.org/ggformula/articles/pkgdown/ggformula-long.html

R Package Citations

| Package | Version | Citation |
|---|---|---|
| ggformula | 0.12.0 | Kaplan and Pruim (2023) |
| mosaic | 1.9.1 | Pruim, Kaplan, and Horton (2017) |
| tidyverse | 2.0.0 | Wickham et al. (2019) |

Kaplan, Daniel, and Randall Pruim. 2023. ggformula: Formula Interface to the Grammar of Graphics. https://CRAN.R-project.org/package=ggformula.
Pruim, Randall, Daniel T Kaplan, and Nicholas J Horton. 2017. “The Mosaic Package: Helping Students to Think with Data Using R.” The R Journal 9 (1): 77–102. https://journal.r-project.org/archive/2017/RJ-2017-024/index.html.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.